Load the necessary libraries
library(car) #for regression diagnostics
library(broom) #for tidy output
library(broom.mixed)
library(ggfortify) #for model diagnostics
library(sjPlot) #for outputs
library(knitr) #for kable
library(effects) #for partial effects plots
library(ggeffects) #for effects plots in ggplot
library(emmeans) #for estimating marginal means
library(MASS) #for glm.nb
library(MuMIn) #for AICc
library(tidyverse) #for data wrangling
library(DHARMa) #for residuals and diagnostics
library(nlme) #for lme
library(lme4) #for glmer
library(glmmTMB) #for glmmTMB
library(performance) #for diagnostic plots
library(see) #for diagnostic plots
theme_set(theme_classic())
To investigate synergistic coral defence by mutualist crustaceans, Mckeon et al. (2012) conducted an aquaria experiment in which colonies of a coral species were placed in a tank along with a preditory seastar and one of four symbiont combinations:
The experiments were conducted in a large octagonal flow-through seawater tank that was partitioned into eight sections, which thereby permitted two of each of the four symbiont combinations to be observed concurrently. The tank was left overnight and in the morning, the presence of feeding scars on each coral colony was scored as evidence of predation. The experiments were repeated ten times, each time with fresh coral colonies, seastars and symbiont.
The ten experimental times represent blocks (random effects) within which the symbiont type (fixed effect) are nested.
mckeon <- read_csv('../data/mckeon.csv', trim_ws=TRUE) %>%
janitor::clean_names() %>%
mutate(
block = factor(block),
symbiont = fct_relevel(symbiont, c("none", "crabs", "shrimp", "both")))
## Parsed with column specification:
## cols(
## BLOCK = col_double(),
## PREDATION = col_double(),
## SYMBIONT = col_character()
## )
glimpse(mckeon)
## Rows: 80
## Columns: 3
## $ block <fct> 1, 1, 2, 2, 3, 3, 4, 4, 5, 5, 6, 6, 7, 7, 8, 8, 9, 9, 10, 1…
## $ predation <dbl> 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0,…
## $ symbiont <fct> none, none, none, none, none, none, none, none, none, none,…
Model formula: \[ y_i \sim{} \mathcal{N}(n, p_i)\\ ln\left(\frac{p_i}{1-p_1}\right) =\boldsymbol{\beta} \bf{X_i} + \boldsymbol{\gamma} \bf{Z_i} \]
where \(\boldsymbol{\beta}\) and \(\boldsymbol{\gamma}\) are vectors of the fixed and random effects parameters respectively and \(\bf{X}\) is the model matrix representing the overall intercept and effects of symbionts on the probability of the colony experiencing predation. \(\bf{Z}\) represents a cell means model matrix for the random intercepts associated with individual coral colonies.
ggplot(mckeon, aes(y=predation, x=symbiont)) +
geom_point(position=position_jitter(width=0.2, height=0))+
facet_wrap(~block)
Note that we could not analyze the data if all the blocks didn’t have any overlap (i.e. if all the individuals died or lived in specific circumstances)
mckeon_glmm <- glmmTMB(predation ~ symbiont + (1|block),
data = mckeon, family = binomial(link = "logit"), REML=T)
# mckeon_glmm2 <- glmmTMB(predation ~ symbiont + (symbiont|block),
# data = mckeon, family = binomial(link = "logit"), REML=T) # did not run properly
sim_resid <- mckeon_glmm %>% simulateResiduals(plot=T)
testOverdispersion(sim_resid)
## testOverdispersion is deprecated, switch your code to using the testDispersion function
##
## DHARMa nonparametric dispersion test via sd of residuals fitted vs.
## simulated
##
## data: simulationOutput
## ratioObsSim = 1.0036, p-value = 0.928
## alternative hypothesis: two.sided
plot_model(mckeon_glmm, type = 'eff')
## $symbiont
plot(allEffects(mckeon_glmm))
ggemmeans(mckeon_glmm, ~symbiont) %>% plot(add.data=T) # marginal effects
ggpredict(mckeon_glmm) %>% plot # predictions
## $symbiont
summary(mckeon_glmm)
## Family: binomial ( logit )
## Formula: predation ~ symbiont + (1 | block)
## Data: mckeon
##
## AIC BIC logLik deviance df.resid
## 62.9 74.8 -26.4 52.9 79
##
## Random effects:
##
## Conditional model:
## Groups Name Variance Std.Dev.
## block (Intercept) 17.83 4.223
## Number of obs: 80, groups: block, 10
##
## Conditional model:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) 4.420 1.782 2.481 0.013109 *
## symbiontcrabs -3.317 1.313 -2.526 0.011534 *
## symbiontshrimp -3.956 1.416 -2.793 0.005218 **
## symbiontboth -5.152 1.565 -3.291 0.000997 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## For no symbiont colonies:
exp(4.420) # 83:1 odds of being predated
## [1] 83.09629
## For crab symbiont colonies:
exp(-3.317) # x0.03626145 fractional decrease
## [1] 0.03626145
exp(4.420 + -3.317) # 3:1 odds of being predated
## [1] 3.013192
plogis(4.420 + -3.317) # probability of predation: 0.7508218
## [1] 0.7508218
## For crab symbiont colonies:
exp(-3.317) # x0.03626145 fractional decrease
## [1] 0.03626145
exp(4.420 + -3.317) # 3:1 odds of being predated
## [1] 3.013192
plogis(4.420 + -3.317) # probability of predation: 0.7508218
## [1] 0.7508218
Intercept: Prob of no symbiont individuals being predated is: 83:1 Prob of crab symbiont colonies being predated declines by a factor of 0.03626145, so: exp(4.420 + -3.317) = 3.01:1 odds of being predated Prob of shrimp symbiont colonies being predated declines by a factor of
mckeon_tidy <- tidy(mckeon_glmm)
tidy(mckeon_glmm, effect='fixed', conf.int=TRUE)
tidy(mckeon_glmm, effect='fixed', conf.int=TRUE, exponentiate=TRUE)
Conclusions:
emmeans(mckeon_glmm, ~ symbiont) %>%
as.data.frame() %>%
mutate(emmean = exp(emmean))
mckeon_glmm %>%
emmeans(~ symbiont) %>% # get differences
regrid() #%>% # does backtransform of the differences BEFORE pair-wise comparisons
## symbiont prob SE df lower.CL upper.CL
## none 0.988 0.0209 79 0.946 1.03
## crabs 0.751 0.3021 79 0.150 1.35
## shrimp 0.614 0.3790 79 -0.140 1.37
## both 0.325 0.3458 79 -0.364 1.01
##
## Confidence level used: 0.95
# pairs() %>% # get pair-wise absolute values of differences
# confint()
Can only do 3 planned contrasts, as there are 4 groups, so k-1 possible to evaluate.
Order is: “none,” “crabs,” “shrimp,” “both”
0, 1, -1, 0
0, 0.5, 0.5, -1
1, -1/3, -1/3, -1/3
cmat <- cbind('crab_vs_shrimp' = c(0, 1, -1, 0),
'one_vs_twosym' = c(0, 0.5, 0.5, -1),
'nosym_vs_sym' = c(1, -1/3, -1/3, -1/3))
cmat
## crab_vs_shrimp one_vs_twosym nosym_vs_sym
## [1,] 0 0.0 1.0000000
## [2,] 1 0.5 -0.3333333
## [3,] -1 0.5 -0.3333333
## [4,] 0 -1.0 -0.3333333
colSums(cmat) # need to sum to zero
## crab_vs_shrimp one_vs_twosym nosym_vs_sym
## 0.000000e+00 0.000000e+00 5.551115e-17
crossprod(cmat) # needs to have only zeros on the off-diagonal (orthogonal)
## crab_vs_shrimp one_vs_twosym nosym_vs_sym
## crab_vs_shrimp 2 0.0 0.000000
## one_vs_twosym 0 1.5 0.000000
## nosym_vs_sym 0 0.0 1.333333
mckeon_glmm %>%
emmeans(~symbiont, contr = list(symbiont = cmat), type = 'response') #'ratio' is on the log scale
## $emmeans
## symbiont prob SE df lower.CL upper.CL
## none 0.988 0.0209 79 0.7055 1.000
## crabs 0.751 0.3021 79 0.1081 0.987
## shrimp 0.614 0.3790 79 0.0619 0.975
## both 0.325 0.3458 79 0.0204 0.917
##
## Confidence level used: 0.95
## Intervals are back-transformed from the logit scale
##
## $contrasts
## contrast odds.ratio SE df t.ratio p.value
## symbiont.crab_vs_shrimp 1.89 2.18 79 0.556 0.5800
## symbiont.one_vs_twosym 4.55 4.75 79 1.454 0.1499
## symbiont.nosym_vs_sym 62.91 79.54 79 3.276 0.0016
##
## Tests are performed on the log odds ratio scale
mckeon_glmm %>%
emmeans(~symbiont, type = 'response') %>%
contrast(list(cmat)) # returns just the table
## contrast odds.ratio SE df t.ratio p.value
## crab_vs_shrimp 1.89 2.18 79 0.556 0.5800
## one_vs_twosym 4.55 4.75 79 1.454 0.1499
## nosym_vs_sym 62.91 79.54 79 3.276 0.0016
##
## Tests are performed on the log odds ratio scale
Clearly there is only a difference between the no-symbiont vs. symbiont groups.
Note that the odds ratio here is an odds ratio of the groups’ odds ratios (so the odds of a colony with no symbionts being predated is 63 times higher than the odds for colonies with any type of symbiont)
# mckeon_glmm %>%
# emmeans(~symbiont) %>%
# regrid() %>%
# contrast(list(symbiont = cmat)) %>%
# confint()
R^2:
r.squaredGLMM(mckeon_glmm)
## Warning: 'r.squaredGLMM' now calculates a revised statistic. See the help page.
## Warning: The null model is correct only if all variables used by the original
## model remain unchanged.
## R2m R2c
## theoretical 0.1489321 0.8674549
## delta 0.1437647 0.8373578
performance::r2_nakagawa(mckeon_glmm)
## # R2 for Mixed Models
##
## Conditional R2: 0.867
## Marginal R2: 0.149
Delta method of R^2 can be used for any models. But there are some select distributions that R^2 theoretical applies to, so usually we just use whichever ones the nakagawa function provides. Just don’t use theoretical for lognormal(gaussian with a log-link) or gamma(log-link) distributions.
Only about 15% for marginal (fixed effects only) and 86% for conditional R^2, so clearly a lot is explained by individual colony variation in predation!
# warning this is only appropriate for html output
# sjPlot::tab_model(mckeon_glmm, show.se=TRUE, show.aic=TRUE)
emmeans(mckeon_glmm, ~symbiont, type='response') %>%
as.data.frame %>%
ggplot(aes(y=prob, x=symbiont)) +
geom_pointrange(aes(ymin=lower.CL, ymax=upper.CL))
# References